ENH: Add xlsb auto detection to read_excel and respect default options #38710

Merged (12 commits) on Jan 3, 2021

Conversation

lithomas1
Member

@lithomas1 lithomas1 commented Dec 27, 2020

Note: Even though most of the reader options for read_excel only have one valid value, I think it is still useful to respect pandas.set_option.

Also, if some people have an older xlrd (e.g. 1.2.0) and don't have openpyxl, they can use options to force pandas to read xlsx files with xlrd. (This is possible now, but xlrd's maintainer has made it clear he doesn't support things like this.)
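
As a rough illustration of the lookup order discussed here, the sketch below resolves an engine from an explicit argument first, then from a per-extension option, then from a default. The names `resolve_engine` and `DEFAULT_ENGINES`, the plain-dict `options`, and the default mapping are hypothetical stand-ins for illustration, not the pandas internals.

```python
# Hypothetical sketch of engine resolution; not the actual pandas implementation.
# The default mapping below is an assumption made for this example.
DEFAULT_ENGINES = {
    "xls": "xlrd",
    "xlsx": "openpyxl",
    "xlsm": "openpyxl",
    "xlsb": "pyxlsb",
    "ods": "odf",
}


def resolve_engine(ext, engine=None, options=None):
    """Pick a reader engine for a file extension.

    Order: explicit ``engine`` argument, then the user's
    ``io.excel.{ext}.reader`` option, then the built-in default.
    """
    if engine is not None:
        return engine
    options = options or {}
    configured = options.get(f"io.excel.{ext}.reader", "auto")
    if configured != "auto":
        return configured
    return DEFAULT_ENGINES[ext]


# Force xlsx files through xlrd, as someone on xlrd 1.2.0 without openpyxl might.
user_options = {"io.excel.xlsx.reader": "xlrd"}
print(resolve_engine("xlsx", options=user_options))  # xlrd
print(resolve_engine("xlsb"))  # pyxlsb
```

With pandas itself, the user-facing equivalent of `user_options` would be a call such as pd.set_option("io.excel.xlsx.reader", "xlrd").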

@lithomas1 lithomas1 marked this pull request as draft December 27, 2020 00:55
@lithomas1 lithomas1 marked this pull request as ready for review December 27, 2020 01:12
@lithomas1
Member Author

cc @WillAyd @TomAugspurger

@jreback jreback added the IO Excel read_excel, to_excel label Dec 27, 2020
@jreback jreback requested a review from rhshadrach December 27, 2020 22:46
Member

@rhshadrach rhshadrach left a comment

Overall looks good. The behavior of respecting io.excel.{ext}.reader needs tests, including a test for the error message when it is set to a nonexistent engine.
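
A minimal sketch of the kind of check being asked for, assuming a validation step that rejects unknown engine names. `KNOWN_ENGINES`, `validate_engine`, `check_invalid_engine_message`, and the exact message wording are all hypothetical and only illustrate the shape of such a test.

```python
# Hypothetical validation helper; names and message wording are assumptions.
KNOWN_ENGINES = {"xlrd", "openpyxl", "odf", "pyxlsb"}


def validate_engine(engine):
    """Raise a clear error when an option names an engine that does not exist."""
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"Unknown engine: {engine}")
    return engine


def check_invalid_engine_message():
    """Test-style check: return the error text produced for a bad engine name."""
    try:
        validate_engine("does_not_exist")
    except ValueError as err:
        return str(err)
    return None


print(check_invalid_engine_message())  # Unknown engine: does_not_exist
```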

@pep8speaks

pep8speaks commented Dec 28, 2020

Hello @lithomas1! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-01 21:48:02 UTC

@lithomas1
Member Author

FYI @rhshadrach, I'm going to be changing the behavior in this PR for the case where neither xlrd nor openpyxl is installed.
Instead of raising
ImportError: Missing optional dependency 'xlrd'. Install xlrd >= 1.0.0 for Excel support Use pip or conda to install xlrd.
(which is wrong anyway, since we require xlrd>=1.2.0; this will need to be fixed in another PR), we now raise
ImportError: Missing optional dependency 'openpyxl'. Use pip or conda to install openpyxl.

This way, we encourage users to use openpyxl rather than xlrd.
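
The idea can be sketched as a preference-ordered lookup whose error message names the preferred package. This is only an illustration: `is_installed` and `pick_xlsx_engine` are hypothetical helpers, and the fallback to xlrd when only xlrd is present is an assumption of this sketch rather than something stated above.

```python
import importlib.util


def is_installed(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_xlsx_engine(installed=is_installed):
    """Prefer openpyxl; fall back to xlrd only if it is the one available.

    Hypothetical helper. The xlrd fallback is an assumption of this sketch.
    """
    if installed("openpyxl"):
        return "openpyxl"
    if installed("xlrd"):
        return "xlrd"
    # Neither is available: point the user at the preferred package.
    raise ImportError(
        "Missing optional dependency 'openpyxl'. "
        "Use pip or conda to install openpyxl."
    )


# Simulate an environment where neither package is installed.
try:
    pick_xlsx_engine(installed=lambda name: False)
except ImportError as err:
    print(err)
```

Passing the `installed` callable in as a parameter keeps the sketch testable without depending on what happens to be installed locally.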

@lithomas1 lithomas1 requested a review from jreback December 31, 2020 04:41
@lithomas1 lithomas1 requested a review from jreback January 1, 2021 21:48
Member

@rhshadrach rhshadrach left a comment

lgtm, thanks @lithomas1!

@jreback jreback added this to the 1.3 milestone Jan 1, 2021
@jreback
Contributor

jreback commented Jan 1, 2021

Is there a test for #34252, e.g. that setting the option works correctly? (I see one added for an invalid value.) I'm pretty sure we are testing this, but just want to confirm.

@lithomas1
Member Author

@jreback,
I don't think there is a way to know which engine read_excel uses, so I didn't add a test.

@rhshadrach
Member

@lithomas1 read_excel is monkey patched in the tests for this purpose. Use pd.read_excel.keywords["engine"].

@rhshadrach
Member

Ack - my previous comment was incorrect. That of course only shows which engine is passed to read_excel, not which engine is actually used.
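
To illustrate the distinction, here is a small sketch using `functools.partial`, whose `keywords` attribute records only the arguments that were bound. The `read_excel` function below is a hypothetical stand-in, and the assumption that the test suite wraps the real function in a partial is inferred from the comment above.

```python
import functools


def read_excel(path, engine=None):
    """Stand-in reader: resolves engine=None internally, as a real reader would."""
    resolved = engine if engine is not None else "openpyxl"
    return f"read {path} with {resolved}"


# A test fixture might pin the engine by wrapping the function in a partial.
patched = functools.partial(read_excel, engine=None)

# keywords shows only what was passed in, not what was resolved internally.
print(patched.keywords["engine"])  # None
print(patched("data.xlsx"))        # read data.xlsx with openpyxl
```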

@jreback jreback merged commit 8d923c9 into pandas-dev:master Jan 3, 2021
@jreback
Contributor

jreback commented Jan 3, 2021

thanks @lithomas1 very nice!

Labels
IO Excel read_excel, to_excel
Development

Successfully merging this pull request may close these issues.

ENH: Auto Detect engine in read_excel
pd.set_option("io.excel.xlsx.reader", "openpyxl") does not work
4 participants